Word Searching in Document Images Using Word Portion Matching
Internal identifier: 001832 (Main/Exploration); previous: 001831; next: 001833
Authors: Yue Lu [Singapore]; Lim Tan [Singapore]
Source:
- Lecture Notes in Computer Science [0302-9743]; 2002.
Abstract
This paper proposes an approach for searching for a word portion in document images, to facilitate detecting and locating user-specified query words. A feature string is synthesized from the character sequence of the user-specified word, and each word image extracted from the documents is represented by a feature string. An inexact string matching technique then measures the similarity between the two feature strings, from which we can estimate how relevant the document word image is to the user-specified word and decide whether a portion of it matches the user-specified word. Experimental results on real document images show that the approach is promising: it can detect and locate document words that match the user-specified word entirely or partially.
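The matching step described in the abstract can be illustrated with a minimal sketch. It assumes feature strings are ordinary character strings and uses a standard semi-global edit distance (approximate substring matching), which is a stand-in and not necessarily the inexact matching technique used in the paper. The function names `best_portion_match` and `similarity` are hypothetical.

```python
from typing import Tuple


def best_portion_match(query: str, word: str) -> Tuple[int, int]:
    """Return (distance, end) for the portion of `word` closest to `query`.

    Assumption: plain edit distance with unit costs, where the match may
    start and end anywhere inside `word` (semi-global alignment).
    `end` is the exclusive end index of the best-matching portion.
    """
    # prev[j]: cost of aligning the query prefix seen so far against some
    # portion of `word` ending at position j. An empty query matches
    # anywhere at cost 0, which is what allows a free starting position.
    prev = [0] * (len(word) + 1)
    for i, qc in enumerate(query, start=1):
        curr = [i] + [0] * len(word)
        for j, wc in enumerate(word, start=1):
            curr[j] = min(
                prev[j - 1] + (qc != wc),  # match or substitution
                prev[j] + 1,               # query symbol left unmatched
                curr[j - 1] + 1,           # extra symbol in the word
            )
        prev = curr
    # The best portion may end at any position of `word`.
    end = min(range(len(prev)), key=prev.__getitem__)
    return prev[end], end


def similarity(query: str, word: str) -> float:
    """Map the distance to a score in [0, 1]; 1.0 means an exact portion match."""
    if not query:
        return 1.0
    distance, _ = best_portion_match(query, word)
    return 1.0 - distance / len(query)
```

With this sketch, a query such as `"health"` matches a portion of `"unhealthy"` exactly, while a query with one wrong symbol still scores close to 1. A threshold on the score would then decide whether the word image is reported as a hit; the threshold value is a design choice not given in this record.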
DOI: 10.1007/3-540-45869-7_37
Affiliations:
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 000413
- to stream Istex, to step Curation: 000406
- to stream Istex, to step Checkpoint: 000F46
- to stream Main, to step Merge: 001912
- to stream Main, to step Curation: 001832
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Word Searching in Document Images Using Word Portion Matching</title>
<author><name sortKey="Lu, Yue" sort="Lu, Yue" uniqKey="Lu Y" first="Yue" last="Lu">Yue Lu</name>
</author>
<author><name sortKey="Tan, Lim" sort="Tan, Lim" uniqKey="Tan L" first="Lim" last="Tan">Lim Tan</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:19435F2C25ADE5DC5D73CA65197979D6C31294B6</idno>
<date when="2002" year="2002">2002</date>
<idno type="doi">10.1007/3-540-45869-7_37</idno>
<idno type="url">https://api.istex.fr/document/19435F2C25ADE5DC5D73CA65197979D6C31294B6/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000413</idno>
<idno type="wicri:Area/Istex/Curation">000406</idno>
<idno type="wicri:Area/Istex/Checkpoint">000F46</idno>
<idno type="wicri:doubleKey">0302-9743:2002:Lu Y:word:searching:in</idno>
<idno type="wicri:Area/Main/Merge">001912</idno>
<idno type="wicri:Area/Main/Curation">001832</idno>
<idno type="wicri:Area/Main/Exploration">001832</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Word Searching in Document Images Using Word Portion Matching</title>
<author><name sortKey="Lu, Yue" sort="Lu, Yue" uniqKey="Lu Y" first="Yue" last="Lu">Yue Lu</name>
<affiliation wicri:level="1"><country xml:lang="fr">Singapour</country>
<wicri:regionArea>Department of Computer Science, School of Computing National University of Singapore, 117543, Kent Ridge</wicri:regionArea>
<wicri:noRegion>Kent Ridge</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Singapour</country>
</affiliation>
</author>
<author><name sortKey="Tan, Lim" sort="Tan, Lim" uniqKey="Tan L" first="Lim" last="Tan">Lim Tan</name>
<affiliation wicri:level="1"><country xml:lang="fr">Singapour</country>
<wicri:regionArea>Department of Computer Science, School of Computing National University of Singapore, 117543, Kent Ridge</wicri:regionArea>
<wicri:noRegion>Kent Ridge</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Singapour</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2002</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">19435F2C25ADE5DC5D73CA65197979D6C31294B6</idno>
<idno type="DOI">10.1007/3-540-45869-7_37</idno>
<idno type="ChapterID">37</idno>
<idno type="ChapterID">Chap37</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: An approach with the capability of searching a word portion in document images is proposed in this paper, to facilitate the detection and location of the user-specified query words. A feature string is synthesized according to the character sequence in the user-specified word, and each word image extracted from documents are represented by a feature string. Then, an inexact string matching technology is utilized to measure the similarity between the two feature strings, based on which we can estimate how the document word image is relevant to the user-specified word and decide whether its portion is the same as the user-specified word. Experimental results on real document images show that it is a promising approach, which is capable of detecting and locating the document words that entirely match or partially match with the user-specified word.</div>
</front>
</TEI>
<affiliations><list><country><li>Singapour</li>
</country>
</list>
<tree><country name="Singapour"><noRegion><name sortKey="Lu, Yue" sort="Lu, Yue" uniqKey="Lu Y" first="Yue" last="Lu">Yue Lu</name>
</noRegion>
<name sortKey="Lu, Yue" sort="Lu, Yue" uniqKey="Lu Y" first="Yue" last="Lu">Yue Lu</name>
<name sortKey="Tan, Lim" sort="Tan, Lim" uniqKey="Tan L" first="Lim" last="Tan">Lim Tan</name>
<name sortKey="Tan, Lim" sort="Tan, Lim" uniqKey="Tan L" first="Lim" last="Tan">Lim Tan</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001832 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001832 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:19435F2C25ADE5DC5D73CA65197979D6C31294B6 |texte= Word Searching in Document Images Using Word Portion Matching }}
This area was generated with Dilib version V0.6.32.